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Atom-centered point charge model of the molecular electrostatics—a major workhorse of the 
atomistic biomolecular simulations—is usually parameterized by least-squares (LS) fitting of the 
point charge values to a reference electrostatic potential, a procedure that suffers from numerical 
instabilities due to the ill-conditioned nature of the LS problem. To reveal the origins of this ill- 
conditioning, we start with a general treatment of the point charge fitting problem as an inverse 
problem, and construct an analytical model with the point charges spherically arranged according to 
Lebedev quadrature which is naturally suited for the inverse electrostatic problem. This analytical 
model is contrasted to the atom-centered point-charge model that can be viewed as an irregular 
quadrature poorly suited for the problem. This analysis shows that the numerical problems of 
the point charge fitting are due to the decay of the curvatures corresponding to the eigenvectors 
of LS sum Hessian matrix. In part, this ill-conditioning is intrinsic to the problem and related 
to decreasing electrostatic contribution of the higher multipole moments, that are, in the case 
of Lebedev grid model, directly associated with the Hessian eigenvectors. For the atom-centered 
model, this association breaks down beyond the first few eigenvectors related to the high-curvature 
monopole and dipole terms; this leads to even wider spread-out of the Hessian curvature values. 
Using these insights, it is possible to alleviate the ill-conditioning of the LS point-charge fitting 
without introducing external restraints and/or constraints. Also, as the analytical Lebedev grid 
PC model proposed here can reproduce multipole moments up to a given rank, it may provide a 
promising alternative to including explicit multipole terms in a force field. 


I. INTRODUCTION 

The atom-centered point charge (PC) model of molec¬ 
ular electrostatics has been a mainstay of biomolecu¬ 
lar simulations for decades.fi 14] While chemically in¬ 
tuitive and straightforward in technical implementation, 
this model does not provide a sufficiently detailed de¬ 
scription of the anisotropic features of the molecular elec¬ 
trostatic potential (MEP), such as lone pairs, 7r-systems, 
and er-holes, etc. which are mostly governed by higher- 
order multipole terms. [15, 16] These anisotropic effects, 
however, can be described within the PC approximation 
by moving beyond the atom-centered paradigm, i.e. by 
adding non-atom centered PCs/extended points. [17-20] 
Although increasing the number of PCs per atom im¬ 
proves the quality of the electrostatic model, it also can 
exacerbate well-known ill-conditioning and redundancy 
problems[21-23] of the PC fitting procedures, leading to 
numerically unstable solutions. [3, 24, 25] 

These numerical instabilities are usually related to a 
large variation of the PC values for atoms in the interior 
of the molecule, so-called buried atom effect. [3, 4, 26, 27] 
The buried atom (usually methyl and methylene carbons) 
charges can dramatically change due to trivial changes in 
the PC fitting problem (the probe grid sampling, spatial 
orientation of the molecule, etc.), and/or have inconsis¬ 
tent values across very similar molecules or even con- 
formers of the same molecule. [28, 29] As the inclusion 
of non-atom centered PCs into the model produces even 
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more buried centers, it should also increase the numerical 
instabilities of the PC fitting problem. 

In fact, these numerical problems are rooted in the 
mathematical nature of the PC derivation —the least 
squares (LS) fitting to the reference MEP:[1, 2] 

X 2 (q) = \&~ Aq | 2 = |$| 2 + g T ■ q + q T Hq, (1) 
g = -2AT$, (2) 

H = A T A, (3) 

where the LS sum y 2 is the subject of minimization and 
the solution satisfies normal equations: [30] 

A T Aq = A T $. (4) 

Here, the elements of the LS matrix A correspond to 
the inverse distance 1 /r»j between the PC i and the grid 
point j: $ is T-dimensional vector of the reference values 
of MEP; q is IV-dimensional vector of the PC values; g 
is the gradient of the function y 2 at the origin (q = 0); 
H is the Hessian matrix of LS sum y 2 . 

While the ill-conditioning is common to many LS 
fitting problems, [31-33] numerical difficulties associ¬ 
ated with PC fitting are further compounded by com¬ 
monly used total charge constraint using Lagrange 
multiplier. [21, 22, 34, 35] 

One of the most widely used techniques to alleviate the 
numerical instabilities of PC fitting is to add artificial re¬ 
straints to the PC values of the buried atoms. [3, 6, 36, 37] 
Although this method can be extended to models with 
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off-center PCs/extended points, one may wonder if it 
would be possible to overcome these difficulties in a more 
elegant way, based on better physical understanding of 
the problem. 

For instance, an important insight can be gleaned from 
the eigendecomposition of the LS sum Hessian matrix 
(eq. 3): 


Hu,; = KjUj. (5) 

Indeed, the ill-conditioned nature of the LS matrix A 
can be related to the significant differences in the eigen¬ 
values Hi , i.e. the LS sum curvatures along the direc¬ 
tions defined by the eigenvectors u,.[38, 39] Because of 
the 2-3 order of magnitude variation of the k, values, 
different sets of PCs can produce essentially the same 
MEP, as these solutions have the same positions along the 
high-curvature directions, although the positions along 
the low-curvature directions could be quite different. [38] 
Importantly, the eigenvectors with the largest curvatures 
usually correspond to the total charge and dipole moment 
components of the molecule, while the lower-curvature 
eigenvectors do not seem to be associated with particu¬ 
lar multipole moments. [38, 40] 

However, the exact physical origin of the correspon¬ 
dence between the large curvature eigenvectors and the 
first terms of the multipole expansion is unclear, along 
with the nature of the low-curvature eigenvectors. Partic¬ 
ularly, it is not clear if the presence of the low-curvature 
modes of the H matrix and thus the ill-conditioning of 
the LS problem is solely because of the nature of the PC 
fitting problem, or due to some numerical factors, e.g. an 
incomplete sampling of the reference MEP grid. 

To address these questions, we here revisit the PC fit¬ 
ting problem from the first principles. While the atom- 
centered PC model traces back to the intuitive chemical 
concept of the atomic charge, we consider a general PC 
model as a case of the inverse problem, where one seeks 
to recover the source charge distribution from its effect, 
i.e. electrostatic potential distribution. Based on the 
properties of the Coulomb law, we construct a best-case 
electrostatic model for which the inverse problem can be 
solved exactly, both in the continuous case, as well as in 
the case of a discrete (non-atom centered) PC approxi¬ 
mation. 

Using this model, we investigate the nature of the 
eigenvectors u* and their eigenvalues and dissect the 
factors responsible for the ill-conditioning of the LS fit¬ 
ting problem, and discuss how these insights can be used 
to improve and simplify the existing PC derivation pro¬ 
cedures. 


II. POINT CHARGE FITTING AS AN INVERSE 
PROBLEM 

A problem where given an effect (in this case the MEP 
<f>) defined in the region V$, its cause (a charge distribu¬ 
tion p) defined in the region V p needs to be determined 


belongs to a general class of inverse problems and can be 
described by the Fredholm integral equation of the first 
kind: [41] 



k( r, r')p (r') dr' 


$ (r), 


( 6 ) 


where kernel k( r, r') specifies the evolution of the cause 
p( r') into the effect 4>(r), that in this case corresponds 
to the Coulomb law: 


k( r, r') 



( 7 ) 


The integral equation can also be represented as an 
operator equation: 


Kp = <f>, (8) 

where K : U —> V is a linear operator defined on space 
U = range (A*) £ L 2 of square integrable functions, and 
takes values in space V = range(A) £ L 2 ; I<* : V —» U is 
adjoint of K. This equation can be solved exactly if and 
only if $ £ V. However, in general it is not the case, so 
a function p that minimizes the residual norm |$ — Kp\ 
is considered as the LS solution and thus satisfies the 
normal equation: [41, 42] 


K*Kp = K*<f>. 


( 9 ) 


This LS solution can be obtained as the linear combi¬ 
nation of the basis vectors m £ U: [41] 

„=*•** = £;»*>*, (10) 

where K' is the Moore-Penrose inverse, /.q is a singu¬ 
lar value, Vi and iq are left and right singular vectors, 
respectively and the inner product is defined as 

(<F,i>j)= / $(r)t;i(r)dr (11) 

Jv * 

The orthogonal bases {iq}*^ and also form 

the eigenbases of K*K and KK* with eigenvalues p 2 : 

K*Kui = p 2 u h (12) 

KK* Vi = p 2 iVi . (13) 

To obtain a numerical solution to the integral equa¬ 
tion (eq. 6), the regions over which the MEP and charge 
distribution are defined are sampled using a numerical 
quadrature. Given N quadrature nodes over the charge 
distribution and T nodes over the MEP region the in¬ 
tegral equation is transformed into a system of T linear 
equations: 


Kq= $ 


(14) 
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where the T x N matrix K is identical to the LS matrix 
A from eq. 1 and contains the kernel elements kij , as this 
matrix originates from the kernel k( r,r') in the integral 
equation (eq. 6). It will be further referred to as K in 
order to highlight its mathematical origin. 

Then, the PC value at the node i is the product of the 
charge density pt and the quadrature weight wp 

qi = PiWi (15) 
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Since the number of the reference values T is usually 
larger than the number of the unknown PC values N, 
the system of linear equations is overdetermined. Then, 
a solution that minimizes the LS sum \ 2 (q) (eq. 1) and 
satisfies normal equations (eq. 4) is considered as the 
numerical solution to the integral equation (eq. 6). This 
solution can be obtained using singular value decompo¬ 
sition (SVD) of matrix K: [30, 39, 42] 

q = ——— u», (16) 


FIG. 1. Schematic representations of the probe Sr and 
charged S a spheres in the continuous (A) and discrete (B) 
forms. Operators K (eq. 28) and matrix K are represented 
schematically. 


III. THE TWO-SPHERE MODEL 

The Coulomb kernel (eq. 7) can be conveniently ex¬ 
panded in terms of spherical harmonics so the source r / 
and the observation r coordinates are separated but share 
the same origin: [44, 45] 


where K' is the Moore-Penrose pseudoinverse; p, are sin¬ 
gular values of matrix K; vectors v* and u, are left and 
right singular vectors. If the rank r of matrix K is less 
than the dimension of q (r < A), then the matrix K is 
rank deficient. 

Similarly to the continuous case (eqs. 12-13), the or¬ 
thogonal bases {vi}\ =0 and {iti}f =0 form eigenbases for 

KK T and K T K: 


KK T V j = 


(17) 


k( r, r 7 ) 



OO l 

= E E 


l —0 m=—l 


47t 

2 1 + 1 r^ 1 


Yi m {r')Yi m (r), 


( 20 ) 


where r = r/r denotes the unit vector defined by the 
polar ip and azimuthal 6 angles; r< is the smaller and r> 
is the larger of r and r'\ Yi m are orthogonal real-value 
spherical harmonics: [46] 


K T Kuj = /z 2 Ui, (18) 

where K T K is also a Hessian matrix (eq. 5) and pj is 
identical to its eigenvalue Kj, which is the y 2 curvature 
along the direction u, : [4.3] 


Yi m (r)Yi lm i(r)dn = dwSmm', ( 21 ) 

Js 

where dfl is the differential of the solid angle. 

Then, in the region beyond the divergence sphere 
where the charge density vanishes, the MEP can be ex¬ 
panded in a multipole series: [44, 45] 


Mi = «i (19) 

In many LS problems, PC fitting included, the singu¬ 
lar values vary in a wide range, revealing the underlying 
ill-conditioning. [21, 22, 31, 32] As a singular value /q is a 
denominator in the LS solution (eq. 16), the smaller the 
singular value, the larger the effect of the correspond¬ 
ing singular vector u,; on the LS solution. Thus, even 
small variations along iq with small singular value lead 
to a significant variations of the LS solution, although 
these variations do not lead to significant change in the 
quality of the fit y 2 .[38] To understand the origins of the 
ill-conditioning in PC fitting, we next consider a system 
for which the inverse electrostatic problem can be ana¬ 
lytically solved. 


0° l I . - 

= E E v ( 22 ) 

1=0 m=—l V 

where a molecular multipole moment Q™ 1 is given by 

Q Z 1 = I r l p(r)Y lm (r)d 3 r. (23) 

The form of the kernel expansion (eq.20) suggests that 
if the radii r = R and r' = a are fixed, the kernel fc(R, a) 
can uniquely map a charge density over a spherical sur¬ 
face S a to the corresponding potential ( 1 > (R) on a sphere 
Sr and vice versa. Thus, for a probe sphere Sr with the 
radius R greater than the radius of divergence sphere, 
the MEP can be reproduced exactly by a sphere S a with 
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surface charge density <j(a) such that the multipole mo- 
s 

ments of the sphere Q^ are equivalent to the multipole 
moments of the molecule Q™ 1 - 

Qt = QZ l , (24) 

where multipole moments of the sphere are: 

Q?m = j s o-( a )yim(a)dft. (25) 

In this case, the original integral eq. 6 is transformed 
into a surface integral equation: 


's a 


k( R, a)a(a)dCl = <f>(R), 


(26) 


or, equivalently, in an operator form 


Kcr = (27) 

where K : L 2 (S a ) —>• L 2 (Sr) is a compact infinite-rank 
operator (Fig. 1A): 

OO l 

^ = EE (28) 

l —0 m=—l 


where subscripts S a and Sr denote the spheres, on which 
the corresponding spherical harmonics are defined; the 
projection (a , is the inner product on the L 2 (S a ) 
space: 

*(a)l£«(a)dfi (29) 

JS a 

and for each degree l there is a singular value Hi in the 
form of the distance-dependent factor from the MEP ex¬ 
pansion (eq. 20): 


47t a 1 
21 + 1 R l+1 ' 


(30) 


Accordingly, the spherical harmonics and are 
left and right singular vectors and thus the eigenfunctions 
of the operators K*K and KK*, while the squares of 
the singular values m are their eigenvalues (eqs. 12-13). 
Since the singular values /j; and spherical harmonics Y^° 
and form a singular system of the operator K , the 
solution to integral equation (eq. 26) can be expressed 
as: 


* = *'* = £ t (si) 

1=0 m— — l W 

According to the multipole expansion (eq. 22), the 
inner product ('!>, Y^*) depends on the radius R of the 
probe sphere and the multipole moments of the molecule: 


{*A«) = J s * $(R)*£«(R)rfn = \ l^l~ 1R ! +1 QZ l - 

(32) 


The dependence on the radius R cancels out, so the 
charge density depends only on the radius a of the sphere 
S a and the molecular multipole moments: 

OO ^ /"o 7 T" 

*(a) = E E ( 33 ) 

l —0 m=—l 

and the charged sphere S a exactly reproduces the MEP 
$(R). 


IV. ANALYTICAL LEBEDEV GRID POINT 
CHARGE MODEL 

We can construct an approximate discrete analog of 
the two-sphere model (eqs. 26-33, Fig. 1) using a 
quadrature that exactly integrates spherical harmonics 
Y lrn over a sphere up to a given l (eqs. 29 and 32), e.g. 
the widely used[47-49] Lebedev quadrature, [50] that de¬ 
fines N quadrature nodes (Table I in the supplementary 
material[51]) with predetermined angular coordinates 
ipi, and integration weights Wf. 

N 

/ Yi m (0,ip)dQ = y ^Yi m (di,ipi)wi. (34) 

Then, given the surface charge density eq the correspond¬ 
ing point charge is: 


= crjWj. (35) 

Due to the orthogonality of the spherical harmonics 
Yim (eq. 21), the IV-node Lebedev quadrature that ex¬ 
actly integrates spherical harmonics over the sphere S a 
up to l = 2n 

N 

(36) 

defines an orthonormal basis: 

Ys a = { Yf*, —l<m<l }r =0 , (37) 

where the Yjr^ vectors have d n = (n + l) 2 elements de¬ 
fined as: 


^Imi ~ Y im(0i,+i)\JW i “. (38) 

Similarly, the probe sphere Sr can be represented by a 
T-node Lebedev grid that integrates spherical harmonics 
up to l = 2t and defines an orthogonal basis Y s R of 
dimension d t = (t + l) 2 . 

In this discrete representation, the operator K (eq. 28) 
then becomes a T x N matrix K: [52] 


( 39 ) 



5 


where the elements of K, a, and $ are: 


IUj = y/w^w^/nj, (40) 



Since usually the probe grid has more points than the 
source grid, i.e. T > N, the matrix equation (eq. 39) is 
a LS problem (eq. 1) that can be solved using SVD of 
the matrix K (eq. 16), giving a discrete analog of eq. 33: 


a = 


EE ■ 

1=0 m=—l 


$ Y 


Sr 


M 


.Y Sa 

1 im> 


(42) 


where and Y^ are left and right singular vectors, 
and the corresponding singular values /q are the same as 
in the continuous case (eq. 30). 

Since we use the Lebedev quadrature, the dot product 
4> ■ Yf « corresponds to exact numerical integration and 
gives a result identical with the continuous case (eq. 32): 

= = \/^tr5tO™‘' < 43 > 

3 =0 

so the solution to eq. 39 depends only on the radius a 
and the multipole moments Q™ l : 

* = E E \f^ a ~ l QZ l Yt- (44) 

1=0 m=—l 


The corresponding PC values qj can be obtained using 
the quadrature weights w ^ a : 



or, in a vector form: 


q = E E \f^ a ~ l QZ l Yt © w s “, (46) 

Z—0 m=—l 

where w Sa is the vector of the quadrature weights for 
the sphere S a . Therefore, we can use Lebedev grid that 
shares the origin with a molecule to construct an analyt¬ 
ical PC model that exactly reproduces molecular multi- 
pole values up to the degree n. 

From this model, we can see that the ill-conditioning 
of the PC fitting due to the decay of the singular values 
is intrinsic to the inverse electrostatic problem, as the 
singular values jii decrease with increasing l (eq. 30). 
Indeed, the higher the multipole moment, the smaller its 
contribution to the overall electrostatic potential. Also, 
this contribution gets smaller as we move the probe fur¬ 
ther away from the source, and the singular values get 
smaller with the increasing radius of the probe sphere R , 
or decreasing radius of the source sphere a. 


The ill-conditioning problems become even more severe 
as we switch from modeling the MEP using the Lebe¬ 
dev quadrature, which is the best suited to reproduce 
the molecular multipoles, to an irregular atom-centered 
quadrature, as shown on a numerical example below. 

A • • • B 

• • 

• • • : 

• • 

• • • 

Sr grid vdW grid 

FIG. 2. Cross-section representations of the quadratures used 
for two-sphere model (A) (n = 1, N = 6 and t = 11, T = 194 
for spheres S a and Sr, respectively) as compared with the tra¬ 
ditional atom-centered model (B). Green circles correspond to 
the point charges; blue circles correspond to the reference grid 
points 
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V. LEBEDEV GRID VS. ATOM-CENTERED 
MODEL: A NUMERICAL EXAMPLE 


First, we consider an electrostatic PC model of a 
methanol molecule with PCs placed at the nodes of the 
Lebedev quadrature over the sphere S a (a = 2 au) (Fig. 
2A). In this case, the PC values can be obtained analyt¬ 
ically from the reference multipole moments (eq. 46) or 
by numerical fitting to the reference MEP over the probe 
sphere Sr (R = 8 au, t = 11, T = 194): 


* = E 


-u 

/q 


(47) 


where the PC value can be found as qj = CTjyjw^ a and 
the maximum rank r is the number N of quadrature 
nodes/PCs over the sphere S a . The quality of the fit is 
measured using the root mean square deviation (RMSD) 
calculated over the T nodes of the probe grid: 


RMSD = 



(48) 


Naturally, the analytical PC values from eq. 46 exactly 
reproduce the molecular multipole moments up to the 
degree n defined by the quadrature (Table I). For each 
degree l there are 21 +1 values of order m, so overall (n + 
l) 2 multipole moments are reproduced, which matches 
the dimension d n of the corresponding basis Yg a (eq. 37). 
As the dimension d n increases, more multipole moments 
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FIG. 3. Normalized singular values Hi/H i obtained using the 
exact analytical expression eq. 30 (green circles) as com¬ 
pared with the numerical values obtained from SVD of the 
LS matrix for the two-sphere model (red stars) and for atom- 
centered model (black circles). Lebedev quadratures with 
n = 1, N = 6 and t = 11, T = 194 were used for the charged 
S a (a = 2 au) and probe Sr (R = 8 au) spheres, respectively. 

are reproduced and the RMSD rapidly approaches zero 
(Fig. 4 in the supplementary material[51]). 

Since the dimension d n does not match the number 
of quadrature nodes N (Table I in the supplementary 
material[51]),[53, 54] we can obtain numerical solutions 
with eq. 47 that are equivalent to the analytical results 
(eq. 46) by setting the rank r to the dimension of the 
grid, d n = (n + l ) 2 (Table I).[55] 

As the first d n multipole moments Q are reproduced 
by the PC model, the first d n numerical singular values 
Hi exactly match the radius-dependent part (eq. 30) from 
the inverse distance expansion (Fig. 3), and the corre¬ 
sponding right singular vectors match the basis Y s a 
(Fig. 4): 

= {Y&, -l < m < OLo- (49) 

If we do not restrict the rank r to the dimension of 
the grid d n , numerical SVD of the LS matrix K (eq. 47) 
produces N singular vectors/values. While this slightly 
improves the RMSD (Table I), the additional N — d n sin¬ 
gular vectors cannot be described analytically (Fig. 4), 
as they go beyond the dimension d n of the corresponding 
basis Yg a . However, in the fortuitous case of the quadra¬ 
ture with n = 1 and N = 6 , the remaining 6 — 4 = 2 
vectors resemble the basis vectors Y 2 _2 and Y 2 _i, so 
the corresponding quadrupole moments Q 2-2 and Q 2-1 
are accurately reproduced, although the exact numerical 
integration of the spherical harmonics Y 2-2 and Y 2 -i is 
not provided by the 6 -node Lebedev grid. 

Now, we can use the insights from the best-case sce¬ 
nario spherical PC model based on the Lebedev quadra¬ 
ture (Fig. 2A) to understand the traditional PC fitting 
problem with atom-centered charges and the probe grid 
that follows the solvent-accessible surface (vdW grid, Fig. 
2B). From the point of view of the inverse electrostatic 
model, the atom-centered PC fitting corresponds to a 


numerical solution using an irregular and suboptimal in¬ 
tegration grid to represent the source charge distribution. 
This problem can be treated by SVD of the LS matrix 

K: 

„ = (50) 

i=l /i! 

where the maximum value of rank r is the number of 
atoms in the molecule, i.e. r = 6 in the case of methanol. 

We can see that even in this case the singular vector Ui 
with the largest singular value Hi corresponds the total 
charge (Fig. 4), which is reproduced with only a slight 
slight numerical deviation (< 0 . 01 ), a consequence of the 
molecular charge density spillover beyond the solvent- 
accessible surface defining the vdW grid. [38, 40] 

Although the other singular vectors do not exactly 
match the corresponding spherical harmonics, the u 2 114 
vectors can be roughly related to the three components of 
the dipole moment (Fig. 4), and the corresponding sin¬ 
gular values are commensurate with the singular value 
Hi (l = 1) obtained for the Lebedev grid model (Fig. 3). 
The remaining singular values H 5 and 1*6 are significantly 
distorted from the singular value Hi {l = 2 ), so the com¬ 
ponents of the quadrupole moment are not reproduced as 
precisely as the the dipole moment components (Table I). 

Among all singular vectors {uj}f =1 , the singular vec¬ 
tor Ug with the lowest singular value /ig, which is 100 
times smaller than /ii, is dominated by the contribution 
from the methyl carbon atom (Fig. 4). Since such small 
singular values cause numerical instabilities of the LS so¬ 
lution, once can use a regularization technique such as 
truncated SVD (tSVD) that reduces the rank r by remov¬ 
ing the lowest-Hi vector(s) from the SVD expansion. [39] 
Removal of Ug that decreases the rank to r = 5 leads to 
dramatic change in the methyl group charges—the car¬ 
bon atom charge in particular, which drops from 0.22 to 
—0.06. Yet, these changes lead only to marginal changes 
in the the multipole moment and RMSD values, a typi¬ 
cal example of the buried atom effect (Tables I, II). This 
suggests a natural way to impose a restraint on the buried 
atom charges without introducing a restraining function 
into the LS sum \ 2 , an addition that can negatively affect 
the electrostatic properties of the PC model. [22, 56] 
Further removal of the singular vectors U 5 and U 6 (i.e. 
r = 4) leads to severe deterioration of the LS solution, 
as the corresponding multipole moment strongly deviate 
from the reference values and the RMSD significantly 
increases (Tables I, II). Thus, it appears that the tSVD 
approach should be applied only to the singular vectors 
that strongly depend on the buried atoms, an important 
point that will be discussed in detail elsewhere. 

VI. THE TOTAL-CHARGE CONSTRAINT 
REVISITED 

Commonly used PC fitting approaches also modify the 
LS sum (eq. 1) by adding a Lagrange multiplier A in order 
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spherical harmonics two-sphere model 

^ {Y lm , l<m< 1 }Lo & {5*}f =0 

^ 


Yoo 

Y 10 

Yu 

Y1-1 

Y 2 -2 

Y 2 -l 

0.41 

0.0 

0.0 

0.71 

0.79 

-0.46 

0.41 

0.0 

0.0 

-0.71 

0.79 

-0.46 

0.41 

0.71 

0.0 

0.0 

-0.79 

-0.46 

0.41 

-0.71 

0.0 

0.0 

-0.79 

-0.46 

0.41 

0.0 

0.71 

0.0 

0.0 

0.91 

0.41 

0.0 

-0.71 

0.0 

0.0 

0.91 


u 0 

U 1 

U 2 

U 3 

U 4 

u 5 

0.41 

0.0 

0.0 

0.71 

0.49 

-0.31 

0.41 

0.0 

0.0 

-0.71 

0.49 

-0.31 

0.41 

0.71 

0.0 

0.0 

-0.51 

-0.27 

0.41 

-0.71 

0.0 

0.0 

-0.51 

-0.27 

0.41 

0.0 

0.71 

0.0 

0.02 

0.58 

0.41 

0.0 

-0.71 

0.0 

0.02 

0.58 


C 


atom-centered model 

{Mi=o 


u 0 

U 1 

u 2 

U 3 

U 4 

U 5 

0.41 

0.16 

-0.01 

0.0 

-0.1 

0.89 

0.41 

0.24 

-0.43 

0.73 

-0.05 

-0.24 

0.41 

0.25 

-0.49 

-0.69 

-0.05 

-0.24 

0.41 

0.45 

0.68 

-0.03 

0.34 

-0.23 

0.41 

-0.42 

0.32 

-0.02 

-0.72 

-0.19 

0.41 

-0.69 

-0.08 

0.0 

0.59 

0.0 


FIG. 4. The orthonormal bases of the right singular vectors: basis of spherical harmonics Y s a (A), basis from the numerical 
SVD of the LS matrix in two-sphere PC model (B), and atom-centered model (C). 


to constrain the total charge to the correct value: [21, 34, 
35] 

X' 2 (q) = |^ — Kq| 2 + A(1 T • q — Qo), (51) 

which increases the dimension of the Hessian matrix H = 
K T K in the normal equation (eq. 4): 


where n is the index of the eliminated charge. This re¬ 
duces the dimension of the LS problem by one: 


x 2 (q) = £ 




Q 


mol 

0 


' nj 


N -1 

-£ 


' nj 


Qi 


(54) 


H 

1 


q 


KT$ 

IT 

0 


A 


< 2 ° 


where 1 is an all-ones column-vector. 

However, as we have seen, both in the case of the ideal¬ 
ized Lebedev grid and the less-than-ideal atom-centered 
PC models, the Hessian eigenvector with the largest cur¬ 
vature corresponds to the total charge (Fig. 4 and also 
Ref. [38]). Thus, in the case of the two-sphere PC fit¬ 
ting, the total charge is reproduced exactly (Q 0 < 10 -5 ), 
while in the atom-centered PC model the total charge 
only slightly deviates from the exact value due the close 
proximity of the vdW grid and slight distortion of the 
total-charge vector Ui from its analytical analog Yo 
(Qo = 0.003 for methanol, Table I). 

Addition of the Lagrange multiplier leads to an ex¬ 
tra eigenvector U 7 that appears in the eigenbasis of 
the Hessian matrix (Table VIII in the supplementary 
material [51]). The curvature along this vector is the 
smallest in the magnitude («7 = —0.009) and the vec¬ 
tor itself primarily depends on the Lagrange multiplier A, 
with only marginal contribution from the PC values. At 
the same time, remaining eigenvectors {u}® =1 preserve 
the structure of the original eigenbasis, with negligible 
contribution from the Lagrange multiplier A (Table VIII 
in the supplementary material[51]). Thus, application 
of the the total charge constraint in addition to already 
strong restraint (imposed by the eigenvector ui) appears 
to be redundant. Moreover, addition of the Lagrange 
multiplier aggravates the rank deficiency of already ill- 
conditioned LS problem. [21, 22] 

Alternatively, the total charge can be constrained by 
incorporating condition on the proper total charge di¬ 
rectly into the LS sum,[l, 4, 23] by eliminating one of 
the charges and setting it to: 


q n = Q n 0 


N-l 

£ 


i 


and modifies the elements of the Hessian matrix: 



Although the solution obtained with this approach 
is numerically equivalent to the solution with Lagrange 
multiplier, regardless which atom has been eliminated 
(Elimination, Q 0 = 0 in Tables I and II), the structure of 
the right singular vectors becomes disrupted, (Fig. 8 in 
the supplementary material[51]) which prevents the ap¬ 
plication of the truncated SVD to improve the numerical 
stability of the solution. 

Given that even for the atom-centered PC/vdW probe 
model the total charge value deviates only very slightly 
from the reference value, it should be possible to correct 
for this deviation without exacerbating the numerical in¬ 
stabilities of the LS problem, e.g. using the total charge 
vector ui. To do that, we convert the SVD solution (eq. 
50) to a system of linear equations: 

q = £ $ U; = Uc, (56) 

i=o ^£ 

Ci 


U T q = c. (57) 

Then, we replace Ui in U T by an all-ones vector 1, and 
set the corresponding coefficient Ci in c to the exact value 
of the molecular total charge Q™ oi : 

U Q o q = C( ?o, (58) 


where 


u «.= 


1 u 2 


U N 


(53) 


(59) 


























































TABLE I. Effect of the rank r (eq. 47), degree n (eq. 46) and type of the charge constraint on the methanol multipole moments 
and the RMSD (kcal/mol) within the Lebedev grid PC model (a = 2 au, n = 1,2 and N = 6,14) with probe sphere Sr (R = 8 
au, T = 194) and atom-centered PC model with vdW-type grid. 


PC model, probe grid 

Details 

Qo 

Qio 

Qn 

Qi-i Q 20 

Q 21 

Q 2-1 

Q 22 

Q 2-2 

RMSD 


eq. 46, n = 1 

0.000 

0.000 

-0.325 

-0.565 0.000 

0.000 

0.000 

0.000 

0.000 

1.762 

S a {N = 6), Sr(T= 194) 

SVD, r = 4 

0.000 

0.000 

-0.323 

-0.562 0.000 

0.000 

0.000 

0.000 

0.000 

1.762 


SVD, r = 6 

0.000 

0.000 

-0.323 

-0.562 -0.881 

0.000 

0.000 

-0.497 

0.000 

1.668 


eq. 46, n = 2 

0.000 

0.000 

-0.325 

-0.565 -0.887 

0.000 

0.000 

-0.503 

2.882 

0.559 

Sa{N = U), Sr(T = 194) 

SVD, r = 9 

0.000 

0.000 

-0.325 

-0.566 -0.881 

0.000 

0.000 

-0.497 

2.865 

0.559 


SVD, r = 14 

0.000 

0.000 

-0.325 

-0.566 -0.881 

0.000 

0.000 

-0.497 

2.865 

0.425 


SVD, r = 6 

0.007 

0.000 

-0.335 

-0.546 -1.020 

0.001 

0.001 

0.110 

2.761 

2.457 


tSVD, r = 5 

0.009 

0.000 

-0.338 

-0.543 -1.060 

-0.002 

0.000 

0.219 

2.688 

2.625 

atom-centered, vdW 

tSVD, r = 4 

0.003 

0.001 

-0.270 

-0.377 0.574 

-0.003 

-0.003 

0.225 

-1.317 

8.729 


Lagrange, Qo = 0 

0.000 

0.000 

-0.332 

-0.543 -0.989 

0.001 

0.001 

0.073 

2.739 

2.587 


Elimination, Qo = 0 

0.000 

0.000 

-0.332 

-0.543 -0.989 

0.001 

0.001 

0.073 

2.739 

2.587 


SVD, Qo = 0 

0.000 

0.000 

-0.331 

-0.544 -1.010 

0.001 

0.001 

0.097 

2.756 

2.597 


Trivial, Qo = 0 

0.000 

0.000 

-0.331 

-0.544 -1.010 

0.001 

0.001 

0.097 

2.755 

2.597 

Reference 

0.000 

0.000 

-0.325 

-0.565 -0.887 

0.000 

0.000 

-0.503 

2.882 

— 


TABLE II. Effect of the numerical rank r (SVD in eq. 50) and the total-charge constraint on the values of atom-centered PCs 
of methanol and the RMSD (kcal/mol). 


qc 

<m a 

QH t 

qo 

qH 

RMSD 

SVD, r = 6 

0.215 

-0.018 

0.048 

-0.592 

0.371 

2.457 

tSVD, r = 5 

-0.058 

0.056 

0.118 

-0.532 

0.370 

2.625 

tSVD, r = 4 

0.007 

0.089 

-0.101 

-0.070 

-0.010 

8.729 

Lagrange, Qo = 0 

0.276 

-0.035 

0.030 

-0.603 

0.367 

2.587 

Elimination, Qo = 0 

0.276 

-0.035 

0.030 

-0.603 

0.367 

2.587 

SVD, Qo = 0 

0.214 

-0.019 

0.047 

-0.593 

0.370 

2.597 

Trivial, Qo = 0 

0.214 

-0.019 

0.047 

-0.593 

0.370 

2.597 


c Qo 


Qr l c 2 ••• c N 


(60) 


This approach does not introduce any redundant con¬ 
straints, preserves the electrostatic properties of the un¬ 
constrained solution, and results only in to minor changes 
in the PC values (SVD, Q 0 = 0 in Tables I and II) and 
is compatible with truncated SVD. Also, the error in the 
total charge value is small enough and can be corrected 
by simply distributing the Q o error correction across the 
atomic charges; this trivial total charge correction gives 
result nearly identical to eq. 58 (Trivial, Qo = 0 in Tables 
I and II). 


VII. CONCLUSIONS 

To understand the origins of the ill-conditioning of the 
least-squares (LS) point charge (PC) fitting problem, we 
revisited the PC representation of the molecular electro¬ 
static potential (MEP) from the first principles, as an 
example of the inverse problem. 


Based on the properties of the Coulomb potential that 
can be expanded in terms of spherical harmonics, we in¬ 
troduce a model where the MEP of a molecule is ex¬ 
actly reproduced by a charged sphere that has the same 
multipole moments Qi m as the molecule. Using Lebe¬ 
dev quadrature this continuous model is converted into 
a discrete PC model, where the PC values are evaluated 
analytically from the multipole moments Qi m up to the 
maximum value determined by the quadrature. 

In this context, the traditional atom-centered PC 
model can be viewed as an irregular numerical quadra¬ 
ture, poorly suited to reproduce the multipolar expansion 
of the MEP. As such, this quadrature only allows integra¬ 
tion of the monopole and, approximately, dipole terms. 
The corresponding large-curvature—or ‘stiff’[32, 33]— 
Hessian eigenvectors u, can still be related to the cor¬ 
responding multipoles Qi m - This explains previously 
observed correspondence between the highest-curvature 
Hessian eigenvectors and the total charge and the dipole 
moment components; [38, 40] this correspondence quickly 
breaks down for the higher multipole moments. 
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This consideration then reveals the origins of the ill- 
conditioning of the PC fitting due to the presence of low- 
curvature—or ‘sloppy’[32, 33]—vectors U;. The intrinsic 
ill-conditioning arises even in the case of the ideal spheri¬ 
cal model: since the higher-rank multipole moments Qi m 
have smaller contribution to the MEP, the singular val¬ 
ues hi decay as l increases. The ill-conditioning is further 
exacerbated in the numerical treatment of the Lebedev 
grid model because the number of PCs does not match 
the dimension of the basis formed by Lebedev quadra¬ 
ture. The remaining singular values/curvatures are even 
lower in magnitude and do not correspond to particular 
multipole moments Qi m . The same rank-deficiency prob¬ 
lems apply to the atom-centered PC grids. However, in 
that case most of the eigenvectors do not have a direct 
correspondence to the multipole moments, which leads to 
even wider spread-out of the singular values/curvatures. 

These insights can suggest several ways to alleviate 
the ill-conditioning of the problem. For instance, the 
buried atom problem can be addressed by truncating the 
sloppy singular vectors with dominant contribution from 
these atom, instead of introducing additional restrain¬ 
ing functions[3, 6, 36, 37] that can negatively affect the 
overall electrostatic properties of the molecule. [22, 56] 
Also, slight deviations of the total charge of the fitted PC 
solution can be fixed by adjusting the stiff total-charge 
vector Ui and the corresponding coordinate Q ™° 1 2 3 4 5 6 7 8 9 , rather 
than introducing a Lagrange multiplier that increases the 
rank-deficiency of the Hessian matrix. [21, 22] 

The results presented here can help further application 
of the PC model in biomolecular simulations. Although 
the force fields using point charges may not be as accu¬ 
rate as the force fields that explicitly include multipoles 
and/or polarization effects, the simplicity and computa¬ 


tional efficiency of the PC model has ensured its con¬ 
tinued survival. [18] In fact, representation of multipoles 
using the Lebedev grid PC model can provide an alterna¬ 
tive to the multipole moment expansion; [57] it also can 
be used to extend recently proposed Distributed Charge 
Model. [19, 20] 

VIII. COMPUTATIONAL DETAILS 

MEP and multipole moments were calculated at the 
B3LYP/aug-cc-pVDZ level[58-61] as implemented in Q- 
Cliem package.[62] For atom-centered PC fitting the ref¬ 
erence MEP was generated as the cubic grid with lin¬ 
ear density 2.8 points/A, followed by the removal of the 
points outside of 1.0-2.0 van der Waals radii range around 
each atom (vdW grid). For the two-sphere PC model 
the Lebedev quadrature rules were used as implemented 
in PyQuante package. [63, 64] Charge fitting proce¬ 
dures were implemented in the in-house developed fftool- 
box Python library. [65] SVD was performed using numpy 
library. [66] Spherical harmonics were accessed from scipy 
library. [67] 
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